A Staged Approach using Machine Learning and Uncertainty Quantification to Predict the Risk of Hip Fracture

Hip fractures present a significant healthcare challenge, especially within aging populations, where they are often caused by falls. These fractures lead to substantial morbidity and mortality, emphasizing the need for timely surgical intervention. Despite advancements in medical care, hip fractures impose a significant burden on individuals and healthcare systems. This paper focuses on the prediction of hip fracture risk in older and middle-aged adults, where falls and compromised bone quality are predominant factors. We propose a novel staged model that combines advanced imaging and clinical data to improve predictive performance. By using convolutional neural networks (CNNs) to extract features from hip DXA images, along with clinical variables, shape measurements, and texture features, our method provides a comprehensive framework for assessing fracture risk. The study cohort included 547 patients, with 94 experiencing hip fracture. A staged machine learning-based model was developed using two ensemble models: Ensemble 1 (clinical variables only) and Ensemble 2 (clinical variables and DXA imaging features). This staged approach used uncertainty quantification from Ensemble 1 to decide if DXA features are necessary for further prediction. Ensemble 2 exhibited the highest performance, achieving an Area Under the Curve (AUC) of 0.9541, an accuracy of 0.9195, a sensitivity of 0.8078, and a specificity of 0.9427. The staged model also performed well, with an AUC of 0.8486, an accuracy of 0.8611, a sensitivity of 0.5578, and a specificity of 0.9249, outperforming Ensemble 1, which had an AUC of 0.5549, an accuracy of 0.7239, a sensitivity of 0.1956, and a specificity of 0.8343. Furthermore, the staged model suggested that 54.49% of patients did not require DXA scanning. It effectively balanced accuracy and specificity, offering a robust solution when DXA data acquisition is not always feasible. 
Statistical tests confirmed significant differences between the models, highlighting the advantages of the advanced modeling strategies. Our staged approach offers a cost-effective, holistic view of patients' health. It could identify individuals at risk with high accuracy while reducing unnecessary DXA scanning. Our approach shows promise for guiding interventions to prevent hip fractures with reduced cost and radiation.


INTRODUCTION
Hip fractures present a significant healthcare challenge, particularly among aging populations, where they are often precipitated by falls. With the global aging trend, the incidence of hip fractures is expected to rise dramatically in the coming decades. For instance, while the annual global incidence was 1.3 million in 1990, it is projected to reach 7 to 21 million by 2050 [1]. In the United States alone, the annual incidence per 100,000 individuals ranges from 197 to 201 for men and from 511 to 553 for women, with rates increasing significantly with age [2]. These incidents have serious consequences for quality of life. Apart from causing morbidity and mortality, hip fractures impose a substantial economic burden. Patients often face approximately $40,000 in expenses within the first year post-fracture, while the collective annual cost in the US alone surpasses $17 billion.
Diagnosing hip fractures hinges on a meticulous clinical evaluation, typically initiated by a history of falls resulting in hip pain and restricted mobility. However, the diagnostic process extends beyond mere fracture identification, encompassing a comprehensive assessment of underlying medical conditions, social circumstances, and cognitive function, all of which profoundly impact patient care and prognosis. While surgical intervention remains the primary treatment modality for most hip fractures, the timing of surgery is critical for effective pain management and functional restoration. Ensuring optimal preparation for surgery, especially among elderly individuals with intricate medical requirements, is essential for minimizing perioperative risks and maximizing postoperative outcomes. Thus, the diagnostic and treatment approach for hip fractures encompasses not only fracture management but also comprehensive patient-centered care aimed at maximizing functional recovery and quality of life [3].
Bone mineral density (BMD) is a key determinant of hip fracture risk. Dual-energy X-ray absorptiometry (DXA) plays a pivotal role in assessing BMD and fracture risk. DXA serves as the standard imaging modality guiding clinical decisions for the detection, initiation of treatment, and follow-up of individuals with osteoporosis and fracture risk. Recent studies have explored innovative approaches, such as artificial intelligence (AI), to enhance hip fracture risk prediction by leveraging DXA imaging alongside clinical data. Lex et al. [4] investigated the accuracy of AI models in diagnosing hip fractures on radiographs and in predicting postoperative clinical outcomes following hip fracture surgery, relative to current practices. Their systematic review and meta-analysis of 39 studies revealed that AI models perform comparably to expert clinicians in diagnosing hip fractures. Cha et al. [5] systematically reviewed the use of AI and machine learning (ML) in diagnosing and classifying hip fractures, demonstrating high accuracy and effectiveness in clinical settings. Furthermore, Murphy et al. [6] utilized two sets of radiographs: one from a population without hip fractures, collected as part of a bone mass study, and another from patients with hip fractures, drawn from local National Hip Fracture Database (NHFD) audit records. Their study demonstrated that a trained neural network achieved a 19% increase in accuracy in classifying hip fractures compared to experienced human observers within clinical settings. This finding underscores the potential of ML technologies to augment the capabilities of healthcare professionals and improve patient outcomes in orthopedic care. Zhao et al. [7] introduced multiview variational autoencoder (MVAE) and product of expert (PoE) models for predicting proximal femoral fracture loads by integrating whole-genome sequence features and DXA-derived imaging features. Additionally, Hong et al. [8] developed a bone radiomics score for predicting incident hip fractures, using a random forest model and texture analysis of DXA hip images.
Despite these advancements, current ML and AI approaches for predicting hip fractures have notable limitations. Some utilize only a single modality of data, either clinical or imaging, which can limit predictive accuracy. Moreover, most multi-modality ML methods require all modalities to be obtained in advance for effective prediction, adding cost, radiation, and complexity to the diagnostic process. Notably, these modalities could instead be acquired sequentially in clinical practice, with each later acquisition depending on earlier results, which would be more efficient and economical.
To address these limitations, our study introduces a novel staged modeling approach aimed at predicting hip fractures. Unlike current methods, our approach is structured into two distinct stages. In the first stage, we focus solely on clinical features. Subsequently, in the second stage, we expand our analysis to incorporate imaging features extracted from hip DXA images. By integrating both clinical and imaging data with ML and uncertainty quantification, this staged approach aims to enhance prediction accuracy and adaptability to diverse clinical scenarios.

MATERIALS AND METHODS
The dataset utilized in this study was sourced from the UK Biobank (application ID: 61915), representing a valuable resource for investigating bone health parameters. DXA imaging, essential for evaluating BMD and morphology, was performed by trained radiographers using the GE-Lunar iDXA instrument. Regular calibration of this instrument to a manufacturer's phantom (GE-Lunar, Madison, WI) and daily quality control procedures ensured the accuracy and reliability of DXA measurements [9]. This comprehensive DXA dataset covers various anatomical regions, including the whole body, lateral thoraco-lumbar spine, and bilateral hips and knees. Focusing on a subset of 547 patients with DXA hip images, we manually annotated the contours of the left and right femurs to isolate the femur, the region of interest, from the raw images. Additionally, we incorporated other relevant clinical features into our analysis. Among the subset of patients with DXA hip images, 94 individuals had fractures (including 40 males), while the majority (n = 453) were non-fractured individuals (including 226 males). The ethnicity of all patients in our sample is British.

Clinical factors
Our study considered a broad set of variables for characterizing participants' health profiles. These variables encompassed demographic details such as age at recruitment, sex, and genetic sex, alongside anthropometric measurements like weight. Additionally, participants' average total household income before tax and lifestyle factors, including smoking and alcohol consumption statuses, were considered (Supp. Table 1). We also examined dietary habits, including the variation in diet and major dietary changes in the last 5 years, along with occurrences of falls in the past year and bone fractures in the past 5 years. Furthermore, the dataset provided measurements of BMD and bone mineral content (BMC) at various anatomical sites, shedding light on participants' bone health. The regular intake of vitamin and mineral supplements was also documented. This array of clinical data enabled an exploration of factors influencing participants' bone health and related risk factors.

Model evaluation
To predict hip fracture risk, we propose a staged modeling approach designed to enhance predictive accuracy while minimizing clinical costs and procedural time. The methodology is inspired by the sequential decision-making processes commonly observed in clinical practice (Figure 1). To ensure consistency across the dataset, DXA images are standardized to a size of 224 pixels. Our model evaluation process begins with an analysis of hip DXA images, which are annotated to delineate anatomical outlines and serve as the primary input data. Additionally, we integrate clinical data into our analysis, including age, weight, sex, alcohol consumption, smoking status, dietary changes due to illness in the last 5 years, how often diet varies week to week, falls within the last year, history of fractured or broken bones in the last 5 years, and average household income.
Feature extraction is performed using two pre-trained CNN models: VGG16 [10] and Xception [11]. These CNN models extract rich feature representations from the preprocessed DXA images, capturing both global and fine-grained details relevant to prediction. Alongside CNN-based feature extraction, 2D shape measurements and texture features are computed from the DXA images using specialized packages. Specifically, the shape measurements are computed using the IMEA package [12], which assesses the 2D geometric characteristics of the femur region. Similarly, texture features are extracted using the PyRadiomics package [13], enabling the capture of detailed textural information from the DXA images. Subsequently, the extracted features from the CNN models, shape measurements, texture features, and clinical data are combined to form a comprehensive feature set for model evaluation. This integrated feature set provides a holistic representation of both anatomical and clinical aspects relevant to hip fracture prediction.
To reduce dimensionality and enhance model interpretability, univariate feature selection is performed via near-zero variance filtering and correlation filtering. Furthermore, Recursive Feature Elimination (RFE) [14] is used to identify the most relevant features in a multivariate fashion. Together, these methods ensure that only features informative for predicting the target variable are retained for analysis. Bootstrapping is used to create diverse ensembles of models for each stage of the sequential modeling process. Each base model in an ensemble is trained on a resampled subset of the data, promoting robustness and capturing variations within the dataset. A sequential model is then constructed to integrate predictions from the different stages, leveraging the strengths of multiple collections of submodels.
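The two univariate filters described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `filter_features`, the thresholds `var_tol` and `corr_tol`, and the toy feature values are hypothetical, a simple variance cutoff stands in for whatever near-zero variance criterion was actually used, and the RFE step is omitted.

```python
import math

def variance(col):
    """Population variance of one feature column."""
    mean = sum(col) / len(col)
    return sum((v - mean) ** 2 for v in col) / len(col)

def pearson(a, b):
    """Pearson correlation; returns 0.0 if either column is constant."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb) if sa > 0 and sb > 0 else 0.0

def filter_features(columns, var_tol=1e-8, corr_tol=0.95):
    """Keep features that vary and are not near-duplicates of kept ones.

    columns: dict mapping feature name -> list of values.
    var_tol and corr_tol are assumed thresholds, not the paper's values.
    """
    kept = []
    for name, col in columns.items():
        if variance(col) <= var_tol:
            continue  # near-zero variance: carries no information
        if any(abs(pearson(col, columns[k])) > corr_tol for k in kept):
            continue  # highly correlated with an already kept feature
        kept.append(name)
    return kept

# Toy data (hypothetical values, for illustration only).
features = {
    "age": [50, 61, 58, 70, 66],
    "weight": [80, 72, 95, 60, 77],
    "weight_rescaled": [160, 144, 190, 120, 154],  # exactly 2 x weight
    "constant": [1, 1, 1, 1, 1],
}
kept = filter_features(features)
print(kept)
```

In this toy example the constant column and the rescaled copy of weight are dropped, leaving age and weight for the multivariate selection step.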
In the evaluation of our model, a nested cross-validation structure was employed (Figure 2). Initially, the dataset was divided into an outer training fold comprising 492 samples and an outer test fold with 55 samples. To preprocess the data and mitigate outliers, centering/scaling and spatial sign techniques were applied. From the training fold, two separate validation sets of 45 samples each were held out to fine-tune the hyperparameters of the staged model. These hyperparameters, including the standard deviation threshold and midway thresholds, govern the transition from stage 1 to stage 2 in the model, i.e., whether a patient will need to acquire DXA images for a new prediction.
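As a minimal sketch of the preprocessing mentioned above, centering/scaling followed by the spatial sign transformation projects each standardized sample onto the unit sphere, which limits the influence of outliers. The function names and toy values below are hypothetical. For brevity the statistics are computed on the same rows that are transformed; in a cross-validation setting they would normally be estimated on the training data only.

```python
import math

def center_scale(rows):
    """Standardize each column to zero mean and unit sample standard deviation."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
        for j in range(d)
    ]
    return [
        [(r[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0 for j in range(d)]
        for r in rows
    ]

def spatial_sign(rows):
    """Divide each sample by its Euclidean norm (all-zero rows are left as is)."""
    out = []
    for r in rows:
        norm = math.sqrt(sum(v * v for v in r))
        out.append([v / norm for v in r] if norm > 0 else list(r))
    return out

# Toy samples with two features; the last row is an outlier in feature 2.
raw = [[60.0, 70.0], [65.0, 80.0], [70.0, 75.0], [62.0, 300.0]]
scaled = center_scale(raw)
projected = spatial_sign(scaled)
for row in projected:
    print([round(v, 3) for v in row])
```

After the projection every sample has unit length, so the outlying row can no longer dominate distance-based or gradient-based fitting.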
Inner cross-validation was conducted on the remaining training data to optimize the hyperparameters of the ensemble models, such as the number of base models comprising each ensemble, the percentage of random samples used to train each base model, and the base models' own hyperparameters. This process was repeated twice, once for stage 1 data (clinical features) and once for stage 2 data (clinical and DXA image features), resulting in two ensemble models: Ensemble 1 and Ensemble 2.
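The following sketch illustrates how a bootstrapped ensemble can yield both a risk estimate and an uncertainty estimate. It rests on several assumptions: the nearest-centroid base learner is a deliberately simple stand-in (the base models actually used are not specified here), the names `n_models` and `sample_frac` merely mirror the two ensemble hyperparameters mentioned above, and the toy data are invented.

```python
import math
import random
import statistics

def fit_centroid_model(X, y):
    """Stand-in base learner: score by relative distance to class centroids."""
    def centroid(label):
        rows = [x for x, t in zip(X, y) if t == label]
        return [sum(c) / len(rows) for c in zip(*rows)]
    c0, c1 = centroid(0), centroid(1)

    def predict(x):
        d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
        d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
        z = max(-30.0, min(30.0, d0 - d1))  # clamp to avoid overflow
        return 1.0 / (1.0 + math.exp(-z))
    return predict

def fit_bootstrap_ensemble(X, y, n_models=25, sample_frac=0.8, seed=0):
    """Train each base model on a random resample drawn with replacement."""
    rng = random.Random(seed)
    n = len(X)
    k = max(2, int(sample_frac * n))
    models = []
    while len(models) < n_models:
        idx = [rng.randrange(n) for _ in range(k)]
        if {y[i] for i in idx} != {0, 1}:
            continue  # redraw if a class is missing from the resample
        models.append(fit_centroid_model([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict_with_uncertainty(models, x):
    """Mean prediction (risk) and standard deviation (uncertainty)."""
    preds = [m(x) for m in models]
    return statistics.mean(preds), statistics.pstdev(preds)

# Toy data: class 0 near (0, 0), class 1 near (2, 2).
X = [[0.0, 0.0], [0.2, 0.1], [-0.1, 0.3], [0.1, -0.2],
     [2.0, 2.0], [2.1, 1.8], [1.9, 2.2], [2.2, 2.1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
models = fit_bootstrap_ensemble(X, y)
mean_hi, std_hi = predict_with_uncertainty(models, [2.0, 2.0])
mean_lo, std_lo = predict_with_uncertainty(models, [0.0, 0.0])
print(round(mean_hi, 3), round(mean_lo, 3))
```

The spread of the base-model predictions is what a staged model can use as its uncertainty signal: samples on which the base models disagree are natural candidates for the second stage.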
After training the ensemble models separately, the first validation set was utilized to fine-tune the hyperparameter thresholds of the staged model. These thresholds were optimized to achieve a balanced trade-off between predictions retained from Ensemble 1 and those cascading into Ensemble 2, using a scaled weighted Area Under the Curve (AUC) metric. Finally, the best-performing staged model was evaluated using the outer test set to assess its robustness and generalization.
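One plausible form of the stage-transition rule is sketched below. The exact logic is an assumption: here a stage 1 prediction is retained only if the Ensemble 1 standard deviation is at or below a threshold and the mean prediction lies outside a "midway" band around the decision boundary; otherwise DXA is requested and Ensemble 2 is used. The names `staged_predict`, `std_threshold`, and `midway`, and all numeric values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass
class StageDecision:
    needs_dxa: bool   # whether DXA imaging should be acquired
    risk: float       # final predicted fracture risk
    source: str       # which ensemble produced the final prediction

def staged_predict(
    mean1: float,
    std1: float,
    ensemble2_predict: Callable[[], float],
    std_threshold: float = 0.10,
    midway: Tuple[float, float] = (0.35, 0.65),
) -> StageDecision:
    """Keep the stage 1 prediction when it is confident, else go to stage 2.

    ensemble2_predict is only called when DXA is needed, mirroring the
    fact that imaging features exist only after the scan is acquired.
    """
    low, high = midway
    confident = std1 <= std_threshold and not (low <= mean1 <= high)
    if confident:
        return StageDecision(False, mean1, "ensemble_1")
    return StageDecision(True, ensemble2_predict(), "ensemble_2")

# Three hypothetical patients: (Ensemble 1 mean, Ensemble 1 std).
patients = [(0.08, 0.03), (0.55, 0.04), (0.20, 0.25)]
decisions = [staged_predict(m, s, lambda: 0.9) for m, s in patients]
no_dxa_fraction = sum(not d.needs_dxa for d in decisions) / len(decisions)
for d in decisions:
    print(d)
```

Tuning `std_threshold` and `midway` on a validation set trades off the share of patients spared a scan against the predictive performance of the combined model.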

STATISTICAL ANALYSIS
In this study, comprehensive statistical analyses were performed to evaluate model performance and feature associations with hip fracture risk. The DeLong test was used to compare ROC curves, and McNemar's test assessed sensitivity and specificity variations. Chi-square and Fisher's exact tests revealed the associations between categorical variables and fracture risk, while t-tests highlighted differences in continuous variables between fracture groups.
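To make the paired comparison concrete, the sketch below computes an exact two-sided McNemar p-value from the discordant pairs of two models. Whether the exact or the chi-square variant was used in the study is not stated, so the exact form is an assumption, and the function names and toy predictions are hypothetical. Restricting the comparison to fracture cases tests sensitivity; restricting it to non-fracture cases tests specificity.

```python
from math import comb

def discordant_counts(y_true, pred_a, pred_b, subgroup):
    """Count pairs where exactly one model is correct, within one true class.

    subgroup=1 restricts to fracture cases (sensitivity comparison),
    subgroup=0 restricts to non-fracture cases (specificity comparison).
    """
    only_a = only_b = 0
    for truth, a, b in zip(y_true, pred_a, pred_b):
        if truth != subgroup:
            continue
        a_ok, b_ok = a == truth, b == truth
        if a_ok and not b_ok:
            only_a += 1
        elif b_ok and not a_ok:
            only_b += 1
    return only_a, only_b

def mcnemar_exact(only_a, only_b):
    """Two-sided exact (binomial) McNemar p-value from discordant counts."""
    n = only_a + only_b
    if n == 0:
        return 1.0
    k = min(only_a, only_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Toy labels and predictions from two hypothetical models.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
model_a = [1, 1, 1, 0, 0, 0, 0, 1]
model_b = [0, 0, 1, 0, 0, 1, 0, 1]
sens_counts = discordant_counts(y_true, model_a, model_b, subgroup=1)
p_sensitivity = mcnemar_exact(*sens_counts)
print(sens_counts, p_sensitivity)
```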

RESULTS
The study cohort comprised 547 patients, 94 of whom had previously experienced hip fractures. The staged model indicated that 54.49% of the patients did not require DXA scanning, while 45.52% did. Patients were classified as not requiring DXA scanning based on a low-risk profile, such as younger age, no history of fractures, absence of clinical risk factors for osteoporosis (e.g., history of smoking, excessive alcohol consumption), and initial clinical assessments indicating low risk. The distribution of patients not requiring DXA was characterized by the following percentiles: 25th percentile at 35.45%, 50th percentile at 46.78%, and 75th percentile at 59.55%. Table 1 presents the performance metrics of the models employed in this study. Ensemble 2 achieved the highest AUC of 0.95 (95% CI: 0.87-1.00), followed by the staged model at 0.85 (95% CI: 0.78-0.92). Ensemble 1 exhibited a comparatively lower AUC of 0.70 (95% CI: 0.55-0.85). These findings underline the superior predictive performance of Ensemble 2 and the staged model in fracture risk assessment. Turning to accuracy and specificity, the staged model reached an accuracy of 86.11% and a specificity of 92.49%, close to those of Ensemble 2 (91.95% and 94.27%) and well above those of Ensemble 1 (72.39% and 83.43%). Ensemble 2 also displayed the highest sensitivity (80.78%). Additionally, the fracture risk assessment tool (FRAX) with BMD and FRAX without BMD yielded AUC scores of 0.7577 and 0.6185, respectively. Importantly, all models showcased marked improvements over the guideline model, signifying the efficacy of advanced ML approaches in fracture risk assessment.
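As a rough consistency check on the reported metrics, accuracy on a pooled cohort equals the prevalence-weighted average of sensitivity and specificity. The sketch below applies this identity to the values reported in the abstract, using the cohort counts of 94 fracture and 453 non-fracture patients. Because the reported metrics are averages across evaluations, only approximate agreement should be expected; the function name is hypothetical.

```python
N_FRACTURE, N_NO_FRACTURE = 94, 453  # cohort counts reported in the study

def pooled_accuracy(sensitivity, specificity,
                    n_pos=N_FRACTURE, n_neg=N_NO_FRACTURE):
    """Accuracy implied by sensitivity and specificity at a given prevalence."""
    return (sensitivity * n_pos + specificity * n_neg) / (n_pos + n_neg)

# (sensitivity, specificity, reported accuracy) as stated in the abstract.
reported = {
    "Ensemble 1": (0.1956, 0.8343, 0.7239),
    "Ensemble 2": (0.8078, 0.9427, 0.9195),
    "Staged": (0.5578, 0.9249, 0.8611),
}
implied = {name: pooled_accuracy(s, sp) for name, (s, sp, _) in reported.items()}
for name, (_, _, acc) in reported.items():
    print(f"{name}: implied {implied[name]:.4f}, reported {acc:.4f}")
```

The implied and reported accuracies agree to within about 0.001 for all three models, which suggests that the sensitivity, specificity, and accuracy figures in the abstract are mutually consistent.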
Furthermore, the analysis involved rigorous statistical testing to compare the performance of the various models. DeLong tests and McNemar's tests for sensitivity and specificity were utilized, revealing significant differences between the models. Confidence intervals for the DeLong tests were computed, indicating the range of AUC values with 95% confidence. For example, the staged model had a 95% CI of 0.8083-0.8893, Ensemble 1 had a 95% CI of 0.4882-0.6135, Ensemble 2 had a 95% CI of 0.9388-0.976, FRAX with BMD had a 95% CI of 0.7002-0.8151, and FRAX without BMD had a 95% CI of 0.551-0.686. Additionally, the DeLong tests yielded p-values indicating the significance of the differences in AUC between model pairs. For instance, the p-value for comparing Staged vs. Ensemble 1 was <0.001, and for comparing Staged vs. FRAX with BMD it was 0.0041. McNemar's tests likewise provided p-values for the differences in sensitivity and specificity between model pairs, such as between Staged and Boot1 (sensitivity: <0.0001, specificity: <0.0001).
Additionally, the study identified significant associations between fracture risk and categorical variables such as alcohol consumption and average household income (Table 2), as well as notable differences in continuous variables such as age and various BMD measurements among patient groups (Table 3). Baseline statistics, including p-values from chi-square, Fisher's exact, and t-tests for categorical (Table 4) and continuous (Table 5) variables, were also calculated. These findings underscore the potential of ensemble learning and staged modeling in enhancing hip fracture risk assessment, offering insights for clinical decision-making and preventive strategies.
To summarize the findings visually, the ROC curves of the ensemble stage 1 (Figure 3A), ensemble stage 2 (Figure 3B), and staged (Figure 3C) models are presented. Ensemble 2 emerged as the standout performer, consistently surpassing its counterparts. However, no significant disparities were observed between the staged model and either Ensemble 1 or 2, underscoring the robustness of the staged approach. Figures 4A and 4B further enrich our understanding by highlighting the importance of various features. Ensemble 1 underscored age, weight, and dietary changes as significant predictors (Figure 4A). Conversely, Ensemble 2 prioritized DXA parameters, such as convex area and projection area, accentuating their role in fracture risk assessment (Figure 4B).

DISCUSSION
In our study, we developed a staged ML model to predict hip fractures using UK Biobank data from 547 patients, including 94 individuals with a history of hip fractures. Ensemble 1 included only clinical features, while Ensemble 2 included DXA image features along with clinical features. The staged model demonstrated performance comparable to that of Ensemble 2, which incorporated both clinical and DXA features, with an AUC of 0.85 compared to 0.95, accuracy of 0.86 compared to 0.92, sensitivity of 0.56 compared to 0.81, and specificity of 0.92 compared to 0.94, respectively. However, the staged model used DXA data only 45.52% of the time.

AI for hip fracture risk prediction
Recent advancements in hip fracture risk prediction have been marked by a notable transition towards the incorporation of AI and ML techniques. Twinprai et al. [15] focused on the diagnostic accuracy of a YOLOv4-tiny AI model for classifying hip fractures from radiographic images. Their model achieved a sensitivity of 96.2%, specificity of 94.6%, and accuracy of 95%, significantly outperforming general practitioners and first-year residents, and matching the performance of specialist doctors. This demonstrates the potential of AI in enhancing diagnostic precision and efficiency. Li et al. [16] developed a risk prediction model using a Random Survival Forest (RSF) algorithm to predict long-term mortality after hip fracture surgery, achieving a C statistic of 0.83 for 30-day and 0.75 for 1-year mortality. Their model identified key risk factors such as post-operative complications, age, and pre-existing conditions, providing a robust framework for predicting patient outcomes over extended periods. Xu et al. [17] utilized three ML models (Random Forest, Extreme Gradient Boosting, and Backpropagation Neural Network) to predict in-hospital mortality in patients with severe femoral neck fractures, achieving AUC values of 0.98, 0.97, and 0.95, respectively. These high AUC values demonstrate the efficacy of ML models in predicting critical outcomes and guiding early clinical decision-making. The integration of AI and ML technologies in hip fracture diagnosis and mortality prediction marks a significant stride forward in orthopedic care. These models offer enhanced precision and efficiency in clinical decision-making, enabling early detection and personalized treatment strategies for hip fracture patients.
Despite these advancements, challenges persist, particularly in integrating multi-modal data and interpreting complex AI-driven models. One significant challenge lies in the reliance on large and diverse datasets for training and validation, which may not always be readily accessible in clinical settings. Moreover, the interpretability of AI-driven models remains a concern, as their complex algorithms often lack transparency, hindering clinicians' understanding of prediction rationales. Additionally, while ML models demonstrate high accuracy and effectiveness in controlled research environments, their real-world applicability and generalizability to diverse patient populations require further exploration and validation.

Staged modeling for hip fracture risk prediction
Our staged approach for hip fracture risk prediction represents a novel methodology aimed at enhancing the accuracy and reliability of fracture risk assessment. Unlike traditional single-stage models, which often rely on a single set of features for prediction, our approach systematically integrates multiple stages, each tailored to leverage specific types of data. In the first stage of our staged approach, we use clinical variables to build a foundational understanding of each patient's health profile. This initial stage incorporates demographic details, medical history, lifestyle factors, and other relevant clinical indicators to establish a comprehensive baseline for fracture risk assessment. Following the initial clinical assessment, our approach progresses to ensemble stage 2, where imaging features extracted from hip DXA images are included. By incorporating this additional layer of data, we aim to enrich the predictive capabilities of our model, capturing anatomical information that may not be discernible from clinical variables alone. Ensemble 2 emerged as the top-performing model, achieving a high AUC of 0.95. The model also had an average accuracy of 0.9195. Its sensitivity (0.8078) and specificity (0.9427) were notably high.
In assessing the performance of the stage 2 model within our staged framework, we examined its AUC and the corresponding confidence intervals relative to standard deviation percentiles (Figure 5). Moving from left to right along these percentiles restricts the analysis to progressively smaller subsets of patients with higher uncertainty, and the AUC of the model decreases accordingly. This analysis shows that higher predictive uncertainty is associated with lower performance. It also leaves room for a potential third stage that includes genetic data [18] or QCT images [19], which are more costly and less available than DXA but can provide complementary information in the sequential approach.
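The uncertainty-percentile analysis of Figure 5 can be illustrated with a short sketch: patients are ranked by the standard deviation of their ensemble predictions, and the AUC is recomputed on the subset at or above a given percentile. The AUC is computed here with the rank-based (Mann-Whitney) formulation. The function names and the toy labels, scores, and standard deviations are hypothetical and chosen only to show the mechanics.

```python
def auc(y_true, scores):
    """Rank-based AUC: probability that a positive outscores a negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return None  # undefined when one class is absent
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

def auc_above_uncertainty_percentile(y_true, scores, stds, percentile):
    """AUC on the subset whose uncertainty is at or above the given percentile."""
    ranked = sorted(stds)
    cut = ranked[min(len(ranked) - 1, int(percentile / 100 * len(ranked)))]
    keep = [i for i, s in enumerate(stds) if s >= cut]
    subset_auc = auc([y_true[i] for i in keep], [scores[i] for i in keep])
    return subset_auc, len(keep)

# Toy data: the four most uncertain patients are also the worst ranked.
y_true = [0, 1, 0, 1, 0, 1, 0, 1]
scores = [0.10, 0.90, 0.20, 0.80, 0.60, 0.40, 0.55, 0.58]
stds = [0.01, 0.02, 0.03, 0.04, 0.20, 0.22, 0.25, 0.30]
auc_all, n_all = auc_above_uncertainty_percentile(y_true, scores, stds, 0)
auc_high, n_high = auc_above_uncertainty_percentile(y_true, scores, stds, 50)
print(auc_all, n_all, auc_high, n_high)
```

In this toy example the AUC drops when only the more uncertain half of the patients is retained, which is the qualitative pattern described for Figure 5.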
One of the key strengths of our staged approach lies in its adaptability and flexibility. The use of internal logic rules allows for dynamic decision-making, determining whether the acquisition of DXA data is necessary based on the information gathered in the initial clinical stage. This ensures that resources are allocated efficiently, with additional imaging studies being performed only when deemed essential for further risk assessment. Moreover, our staged approach offers enhanced interpretability compared to complex AI-driven models. By breaking down the prediction process into distinct stages, clinicians can better understand the rationale behind each decision, facilitating trust and confidence in the model's outputs.

Comparison with FRAX
In comparing the performance of our staged approach for hip fracture risk prediction with the FRAX tool, we observe notable differences in predictive accuracy. Our approach, leveraging a combination of clinical variables and imaging features, achieved an AUC of 0.85, demonstrating superior discriminatory ability compared to FRAX. Specifically, when comparing our approach to FRAX with BMD, which attained an AUC of 0.7577, we find that our model outperformed it significantly. The higher AUC of our approach indicates a better ability to discriminate between individuals who will and will not sustain a hip fracture. Similarly, when comparing our approach to FRAX without BMD, which yielded an AUC of 0.6185, our model again exhibited superior performance. Although FRAX is a widely used tool for fracture risk assessment, our staged approach demonstrated enhanced accuracy and reliability in predicting hip fractures, underscoring the effectiveness of incorporating imaging features alongside clinical variables.

Cost and radiation reduction
The staged approach for hip fracture risk prediction offers a comprehensive strategy that not only enhances diagnostic accuracy but also addresses cost and radiation concerns associated with conventional methods. It revealed that 54.49% of the patients did not require DXA scanning, while 45.52% did. The distribution of patients not requiring DXA was characterized by the following percentiles: 25th percentile at 35.45%, 50th percentile at 46.78%, and 75th percentile at 59.55%. By accurately identifying individuals at high risk of hip fractures, our approach enables targeted intervention and preventive measures, minimizing unnecessary diagnostic tests and treatments for those at lower risk. This tailored approach optimizes resource allocation, which can lead to cost savings within healthcare systems. Furthermore, early detection and intervention facilitated by our approach can prevent costly hip fracture-related complications, such as prolonged hospital stays and postoperative issues, thus reducing overall healthcare expenditure.
An essential aspect of our approach is the incorporation of imaging features extracted from existing diagnostic scans, such as DXA images. This eliminates the need for additional imaging tests, thereby minimizing radiation exposure for patients. Leveraging existing imaging data more efficiently not only prioritizes patient safety but also mitigates potential risks associated with excessive radiation exposure, including long-term health consequences. Moreover, our staged modeling approach allows for the selective use of imaging techniques, like DXA scans, based on individual risk profiles derived from clinical data. This targeted approach minimizes unnecessary imaging tests, further reducing radiation exposure and associated costs. Additionally, the integration of clinical and DXA imaging data provides a holistic assessment of fracture risk, enhancing diagnostic accuracy and reducing the likelihood of missed diagnoses or unnecessary treatments.

Interpretability of our model
The clinical relevance of our findings is underscored by the identification of significant predictors of hip fracture risk. Our models identified age, weight, dietary changes, and DXA parameters as key predictors, aligning with established literature on fracture risk factors. These findings have the potential to guide clinical decision-making by enabling the early identification of individuals at high risk of fractures, thus facilitating the implementation of tailored interventions to effectively reduce fracture risk. There is potential for further refinement and expansion of our staged modeling approach by incorporating additional features, such as genetic data, to enhance the predictive capabilities of the model.

LIMITATIONS
First, this study involved a relatively small number of subjects, which is an inherent limitation of the study design. Increasing the sample size would improve the statistical power and generalizability of our findings. Additionally, the performance and feasibility of the data-driven system might be influenced by the quality of the data. For instance, inconsistencies or inaccuracies in clinical data and DXA images could impact the model's predictive accuracy. Second, there were missing features in the UK Biobank repository. Not all potential risk factors for hip fractures were captured or included in the analysis. This limitation might have resulted in an incomplete representation of each patient's health profile. Incorporating additional relevant features such as genetic data, comprehensive environmental factors, and more detailed medical history could further refine the model's predictive capabilities. Lastly, this study did not include external validation using datasets from other populations or healthcare settings.

CONCLUSION
We developed a staged approach combining clinical data and DXA hip images for hip fracture risk prediction. By considering factors such as age, weight, and bone health alongside images, together with machine learning and uncertainty quantification, the model offers a cost-effective, holistic view of patients' health. Through rigorous evaluation, we found that our staged approach could identify individuals at risk with high accuracy while reducing unnecessary DXA scanning. It shows promise for guiding interventions to prevent hip fractures with reduced cost and radiation.

Figure 3A. ROC curves for ensemble stage 1

Figure 3B. ROC curves for ensemble stage 2

Figure 3C. ROC curves for the staged model

Table 1. Performance metrics of models
Table 1 summarizes the average (AVG) performance metrics, such as AUC, accuracy, sensitivity, and specificity, for the various models. It includes standard deviation (STD) values to indicate metric variability across evaluations.

Table 3 compares measurements and statistics between two groups: individuals who experienced hip fractures and those who did not. Each row represents a specific feature. The columns show the average value (mean) and the variation (standard deviation) within each group.

Table 4 presents p-values from both the chi-square test and Fisher's exact test for categorical features. The tests evaluate associations between categorical variables and the outcome.

Table 5 displays the p-values obtained from t-tests for continuous features used in the study. The t-tests evaluate differences in means between groups for each variable.